{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/airctic/icevision/blob/master/notebooks/getting_started_semantic_segmentation.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ac49b0c8",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Getting Started with Semantic Segmentation using IceVision"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "99cab1f7",
   "metadata": {},
   "source": [
    "## Introduction to IceVision\n",
    "IceVision is a Framework for object detection, instance segmentation and semantic segmentation that makes it easier to prepare data, train an object detection model, and use that model for inference.\n",
    "\n",
    "The IceVision Framework provides a layer across multiple deep learning engines, libraries, models, and data sets.\n",
    "\n",
    "It enables you to work with multiple training engines, including [fastai](https://github.com/fastai/fastai), and [pytorch-lightning](https://github.com/PyTorchLightning/pytorch-lightning).\n",
    "\n",
    "It enables you to work with some of the best deep learning libraries including [mmdetection](https://arxiv.org/abs/1906.07155), [Ross Wightman's efficientdet implementation](https://github.com/rwightman/efficientdet-pytorch) and model library, [torchvision](https://pytorch.org/vision/stable/index.html), [ultralytics Yolo](https://github.com/ultralytics/yolov5), and [mmsegmentation](https://github.com/open-mmlab/mmsegmentation).\n",
    "\n",
    "It enables you to select from many possible models and backbones from these libraries.\n",
    "\n",
    "IceVision lets you switch between them with ease. This means that you can pick the engine, library, model, and data format that work for you now and easily change them in the future. You can experiment with with them to see which ones meet your requirements.\n",
    "\n",
    "## Getting Started with Semantic Segmentation\n",
    "\n",
    "This notebook will walk you through the training of models for **semantic segmentation** - a task that consists in classifying each pixel of an image into one of multiple classes. \n",
    "\n",
    "In this tutorial, you will learn how to  \n",
    "1. Install IceVision. This will include the IceData package that provides easy access to several sample datasets, as well as the engines and libraries that IceVision works with.  \n",
    "2. Download and prepare a dataset to work with.  \n",
    "3. Select an object detection library, model, and backbone.  \n",
    "4. Instantiate the model, and then train it with both the fastai engine.  \n",
    "5. And finally, use the model to identify objects in images.  \n",
    "\n",
    "The notebook is set up so that you can easily select different libraries, models, and backbones to try."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Install IceVision and IceData"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3348552e",
   "metadata": {},
   "source": [
    "The following downloads and runs a short shell script. The script installs IceVision, IceData, the MMDetection library, the MMSegmentation library and Yolo v5 as well as the fastai and pytorch lightning engines."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0e0bb5e5",
   "metadata": {},
   "source": [
    "Install from pypi..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Install from pypi..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# # Torch - Torchvision - IceVision - IceData - MMDetection - YOLOv5 - EfficientDet - mmsegmentation Installation\n",
    "# !wget https://raw.githubusercontent.com/airctic/icevision/master/icevision_install.sh\n",
    "\n",
    "# # Choose your installation target: cuda11 or cuda10 or cpu\n",
    "# !bash icevision_install.sh cuda11"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6843bb7c",
   "metadata": {},
   "source": [
    "... or from icevision master"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2674e9ff",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Torch - Torchvision - IceVision - IceData - MMDetection - YOLOv5 - EfficientDet - mmsegmentation Installation\n",
    "!wget https://raw.githubusercontent.com/airctic/icevision/master/icevision_install.sh\n",
    "\n",
    "# Choose your installation target: cuda11 or cuda10 or cpu\n",
    "!bash icevision_install.sh cuda11 master"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c329894a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Restart kernel after installation\n",
    "import IPython\n",
    "IPython.Application.instance().kernel.do_shutdown(True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Imports\n",
    "All of the IceVision components can be easily imported with a single line.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from icevision.all import *"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "13d6ff3e",
   "metadata": {},
   "source": [
    "### Alternative option of importing icevision\n",
    "\n",
    "In some environments wildcard imports are not allowed (eg: Kubeflow Pipelines & distributed environment that rely on pickling). \n",
    "\n",
    "In that case you can call `import icevision.all as iv`  instead of `from icevision.all import *`\n",
    "\n",
    "In that case you should reference icevision objects with the `iv.` prefix, so for instance one would do `iv.Dataset` instead of the regular approach of just doing `Dataset`, same for any other icevision object (eg: `iv.tfms.A.Adapter` instead of `tfms.A.Adapter` etc)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Download and prepare a dataset\n",
    "Now we can start by downloading the camvid tiny dataset, which contains . This tiny dataset contains 100 images whose pixels are classified in 33 classes, including:\n",
    "- animal,\n",
    "- car,\n",
    "- bridge,\n",
    "- building.\n",
    "\n",
    "IceVision provides methods to load a dataset, parse annotation files, and more."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "96f2e935",
   "metadata": {},
   "source": [
    "Download the camvid tiny dataset and load it using `icedata`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Download data\n",
    "data_url = 'https://s3.amazonaws.com/fast-ai-sample/camvid_tiny.tgz'\n",
    "data_dir = icedata.load_data(data_url, 'camvid_tiny') / 'camvid_tiny'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "00405d17",
   "metadata": {},
   "source": [
    "Retrieve class codes from dataset file and create a class map (a structure that maps a class identifier, in this case an integer, to the actual class)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "codes = np.loadtxt(data_dir/'codes.txt', dtype=str)\n",
    "class_map = ClassMap(list(codes))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e824e164",
   "metadata": {},
   "source": [
    "Get images files"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "images_dir = data_dir/'images'\n",
    "labels_dir = data_dir/'labels'\n",
    "image_files = get_image_files(images_dir)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "74f4990f",
   "metadata": {},
   "source": [
    "## Parse the dataset\n",
    "\n",
    "A unit of data in IceVision is called a record, which contains all the information required to handle a given image (e.g. path to the image, segmentation masks, class map, etc..).\n",
    "\n",
    "Here, we build a collection of records by iterating through the image files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "records = RecordCollection(SemanticSegmentationRecord)\n",
    "\n",
    "for image_file in pbar(image_files):\n",
    "    record = records.get_by_record_id(image_file.stem)\n",
    "\n",
    "    if record.is_new:\n",
    "        record.set_filepath(image_file)\n",
    "        record.set_img_size(get_img_size(image_file))\n",
    "        record.segmentation.set_class_map(class_map)\n",
    "\n",
    "    mask_file = SemanticMaskFile(labels_dir / f'{image_file.stem}_P.png')\n",
    "    record.segmentation.set_mask(mask_file)\n",
    "    \n",
    "records = records.autofix()\n",
    "train_records, valid_records = records.make_splits(RandomSplitter([0.8, 0.2]))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c0a11ef1",
   "metadata": {},
   "source": [
    "## Take a peak at records\n",
    "\n",
    "Using `show_records`, we can preview the content of the records we created"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "sample_records = random.choices(records, k=3)\n",
    "show_records(sample_records, ncols=3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating datasets with augmentations and transforms\n",
    "\n",
    "Data augmentations are essential for robust training and results on many datasets and deep learning tasks. IceVision ships with the [Albumentations](https://albumentations.ai/docs/) library for defining and executing transformations, but can be extended to use others.\n",
    "\n",
    "For this tutorial, we apply the Albumentation's default `aug_tfms` to the training set. `aug_tfms` randomly applies broadly useful transformations including rotation, cropping, horizontal flips, and more. See the Albumentations documentation to learn how to customize each transformation more fully.\n",
    "\n",
    "The validation set is only resized (with padding).\n",
    "\n",
    "We then create `Datasets` for both. The dataset applies the transforms to the annotations (such as bounding boxes) and images in the data records."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "presize, size = 512, 384\n",
    "presize, size = ImgSize(presize, int(presize*.75)), ImgSize(size, int(size*.75))\n",
    "\n",
    "aug_tfms = tfms.A.aug_tfms(presize=presize, size=size, pad=None,\n",
    "                           crop_fn=partial(tfms.A.RandomCrop, p=0.5),\n",
    "                           shift_scale_rotate=tfms.A.ShiftScaleRotate(rotate_limit=2),\n",
    "                          )\n",
    "train_tfms = tfms.A.Adapter([*aug_tfms, tfms.A.Normalize()])\n",
    "valid_tfms = tfms.A.Adapter([tfms.A.resize(size), tfms.A.Normalize()])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "train_ds = Dataset(train_records, train_tfms)\n",
    "valid_ds = Dataset(valid_records, valid_tfms)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3e9f8f83",
   "metadata": {},
   "source": [
    "### Understanding the transforms\n",
    "\n",
    "The Dataset transforms are only applied when we grab (get) an item. Several of the default `aug_tfms` have a random element to them. For example, one might perform a rotation with probability 0.5 where the angle of rotation  is randomly selected between +45 and -45 degrees.\n",
    "\n",
    "This means that the learner sees a slightly different version of an image each time it is accessed. This effectively increases the size of the dataset and improves learning.\n",
    "\n",
    "We can look at result of getting the 0th image from the dataset a few times and see the differences. Each time you run the next cell, you will see different results due to the random element in applying transformations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds_samples = [train_ds[0] for _ in range(3)]\n",
    "show_samples(ds_samples, ncols=3)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e7701202",
   "metadata": {},
   "source": [
    "# Select a library, model, and backbone\n",
    "\n",
    "In order to create a model, we need to:\n",
    "\n",
    "- Choose one of the **libraries** supported by IceVision\n",
    "- Choose one of the **models** supported by the library\n",
    "- Choose one of the **backbones** corresponding to a chosen model\n",
    "\n",
    "You can access any supported models by following the IceVision unified API, use code completion to explore the available models for each library."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Creating a model\n",
    "\n",
    "Selections only take two simple lines of code. For example, to try the `mmsegmentation` library using the `deeplabv3` model and the `resnet50_d8` backbone could be specified by:\n",
    "\n",
    "```python\n",
    "model_type = models.mmseg.deeplab3\n",
    "backbone = model_type.backbones.backbones.resnet50_d8\n",
    "```\n",
    "\n",
    "As pretrained models are used by default, we typically leave this out of the backbone creation step.\n",
    "\n",
    "We've selected a few of the many options below. You can easily pick which option you want to try by setting the value of `selection`. This shows you how easy it is to try new libraries, models, and backbones."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0209ad39",
   "metadata": {},
   "outputs": [],
   "source": [
    "selection = 0\n",
    "\n",
    "if selection == 0:\n",
    "  model_type = models.fastai.unet\n",
    "  backbone = model_type.backbones.resnet34(pretrained=True)\n",
    "  model = model_type.model(backbone=backbone, num_classes=class_map.num_classes, img_size=size)\n",
    "\n",
    "if selection == 1:\n",
    "  model_type = models.mmseg.deeplabv3\n",
    "  backbone = model_type.backbones.resnet50_d8(pretrained=True)\n",
    "  model = model_type.model(backbone=backbone, num_classes=class_map.num_classes)\n",
    "\n",
    "if selection == 2:\n",
    "  model_type = models.mmseg.deeplabv3\n",
    "  backbone = model_type.backbones.resnet50_d8(pretrained=True)\n",
    "  model = model_type.model(backbone=backbone, num_classes=class_map.num_classes)\n",
    "\n",
    "if selection == 3:\n",
    "  model_type = models.mmseg.segformer\n",
    "  backbone = model_type.backbones.mit_b0(pretrained=True)\n",
    "  model = model_type.model(backbone=backbone, num_classes=class_map.num_classes)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "84dd5160",
   "metadata": {},
   "source": [
    "## Data Loader\n",
    "The Data Loader is specific to a model_type. The job of the data loader is to get items from a dataset and batch them up in the specific format required by each model. This is why creating the data loaders is separated from creating the datasets.\n",
    "\n",
    "We can take a look at the first batch of items from the `valid_dl`. Remember that the `valid_tfms` only resized (with padding) and normalized records, so different images, for example, are not returned each time. This is important to provide consistent validation during training."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Data Loaders\n",
    "train_dl = model_type.train_dl(train_ds, batch_size=8, num_workers=4, shuffle=True)\n",
    "valid_dl = model_type.valid_dl(valid_ds, batch_size=8, num_workers=4, shuffle=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9e9f24bc",
   "metadata": {},
   "outputs": [],
   "source": [
    "# show batch\n",
    "model_type.show_batch(first(valid_dl), ncols=4)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "241ac876",
   "metadata": {},
   "source": [
    "## Metrics\n",
    "\n",
    "The fastai and pytorch lightning engines collect metrics to track progress during training. IceVision provides metric classes that work across the engines and libraries.\n",
    "\n",
    "The same metrics can be used for both fastai and pytorch lightning.\n",
    "\n",
    "As this is a segmentation problem, we are going to use two metrics: multi-class Diece coefficient, and segmentation accuracy. Note that we are ignoring \"void\" when computing accuracy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "b87644f5",
   "metadata": {},
   "outputs": [],
   "source": [
    "metrics = [MulticlassDiceCoefficient(), SegmentationAccuracy(ignore_class=class_map.get_by_name(\"Void\"))]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Training\n",
    "\n",
    "IceVision is an agnostic framework meaning it can be plugged into other DL learning engines such as [fastai2](https://github.com/fastai/fastai2), and [pytorch-lightning](https://github.com/PyTorchLightning/pytorch-lightning).  \n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "513f3b3e",
   "metadata": {},
   "source": [
    "### Training using fastai"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "learn = model_type.fastai.learner(dls=[train_dl, valid_dl], model=model, metrics=metrics)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2163f2ad",
   "metadata": {},
   "source": [
    "Because we use fastai, we get access to its features such as the learning rate finder"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "28626974",
   "metadata": {},
   "outputs": [],
   "source": [
    "learn.lr_find()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "learn.fine_tune(10, 1e-4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using the model - inference and showing results\n",
    "\n",
    "The first step in reviewing the model is to show results from the validation dataset. This is easy to do with the `show_results` function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1d96bdf",
   "metadata": {},
   "outputs": [],
   "source": [
    "model_type.show_results(model, valid_ds, num_samples=2)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2ab7d46a",
   "metadata": {},
   "source": [
    "### Prediction\n",
    "\n",
    "Sometimes you want to have more control than `show_results` provides. You can construct an inference dataloader using `infer_dl` from any IceVision dataset and pass this to `predict_dl` and use `show_preds` to look at the predictions.\n",
    "\n",
    "A prediction is returned as a dict with keys: `scores`, `labels`, `bboxes`, and possibly `masks`. \n",
    "\n",
    "Prediction functions that take a `keep_images` argument will only return the (tensor representation of the) image when it is `True`. In interactive environments, such as a notebook, it is helpful to see the image with bounding boxes and labels applied. In a deployment context, however, it is typically more useful (and efficient) to return the bounding boxes by themselves.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "infer_dl = model_type.infer_dl(valid_ds, batch_size=4, shuffle=False)\n",
    "preds = model_type.predict_from_dl(model, infer_dl, keep_images=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "show_sample(preds[0].pred)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e69d2dd6",
   "metadata": {},
   "source": [
    "## Happy Learning!\n",
    "\n",
    "If you need any assistance, feel free to join our [forum](https://discord.gg/JDBeZYK)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.10.4 64-bit",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.4"
  },
  "vscode": {
   "interpreter": {
    "hash": "916dbcbb3f70747c44a77c7bcd40155683ae19c65e1c03b4aa3499c5328201f1"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
